biological system
A Spectral Theory of Neural Prediction and Alignment
The representations of neural networks are often compared to those of biological systems by performing regression between the neural network responses and those measured from biological systems. Many different state-of-the-art deep neural networks yield similar neural predictions, but it remains unclear how to differentiate among models that perform equally well at predicting neural responses. To gain insight into this, we use a recent theoretical framework that relates the generalization error from regression to the spectral properties of the model and the target. We apply this theory to the case of regression between model activations and neural responses and decompose the neural prediction error in terms of the model eigenspectra, alignment of model eigenvectors and neural responses, and the training set size. Using this decomposition, we introduce geometrical measures to interpret the neural prediction error. We test a large number of deep neural networks that predict visual cortical activity and show that there are multiple types of geometries that result in low neural prediction error as measured via regression. The work demonstrates that carefully decomposing representational metrics can provide interpretability of how models are capturing neural activity and points the way towards improved models of neural activity.
Simulation-based Methods for Optimal Sampling Design in Systems Biology
Ha, Tuan Minh, Nguyen, Binh Thanh, Ho, Lam Si Tung
In many areas of systems biology, including virology, pharmacokinetics, and population biology, dynamical systems are commonly used to describe biological processes. These systems can be characterized by estimating their parameters from sampled data. The key problem is how to optimally select sampling points to achieve accurate parameter estimation. Classical approaches often rely on Fisher information matrix-based criteria such as A-, D-, and E-optimality, which require an initial parameter estimate and may yield suboptimal results when the estimate is inaccurate. This study proposes two simulation-based methods for optimal sampling design that do not depend on initial parameter estimates. The first method, E-optimal-ranking (EOR), employs the E-optimal criterion, while the second utilizes a Long Short-Term Memory (LSTM) neural network. Simulation studies based on the Lotka-Volterra and three-compartment models demonstrate that the proposed methods outperform both random selection and classical E-optimal design.
- Asia > Vietnam > Hồ Chí Minh City > Hồ Chí Minh City (0.14)
- North America > Canada > Nova Scotia > Halifax Regional Municipality > Halifax (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
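For readers unfamiliar with the classical baseline that the sampling-design abstract above contrasts itself with, the following is a minimal sketch of E-optimal design: choose the sampling times that maximize the smallest eigenvalue of the Fisher information matrix. It is not the EOR or LSTM method from the paper. The toy model y = a * exp(-b * t), the unit-variance noise assumption, the candidate time grid, and all function names are hypothetical choices made only for illustration.

```python
import math
from itertools import combinations


def sensitivities(t, a, b):
    """Partial derivatives of the toy model y = a * exp(-b * t) w.r.t. (a, b)."""
    e = math.exp(-b * t)
    return e, -a * t * e


def fisher_information(times, a, b):
    """Entries (f11, f12, f22) of the 2x2 Fisher information matrix.

    Assumes independent Gaussian noise with unit variance (an assumption).
    """
    f11 = f12 = f22 = 0.0
    for t in times:
        sa, sb = sensitivities(t, a, b)
        f11 += sa * sa
        f12 += sa * sb
        f22 += sb * sb
    return f11, f12, f22


def min_eigenvalue(f11, f12, f22):
    """Smallest eigenvalue of the symmetric matrix [[f11, f12], [f12, f22]]."""
    mean = 0.5 * (f11 + f22)
    radius = math.hypot(0.5 * (f11 - f22), f12)
    return mean - radius


def e_optimal_design(candidates, k, a, b):
    """Exhaustively pick k sampling times maximizing the smallest eigenvalue."""
    return max(
        combinations(candidates, k),
        key=lambda design: min_eigenvalue(*fisher_information(design, a, b)),
    )


# Hypothetical candidate grid and initial parameter guess.
candidate_times = [0.5 * i for i in range(9)]  # 0.0, 0.5, ..., 4.0
guess_a, guess_b = 1.0, 1.0
best_design = e_optimal_design(candidate_times, 3, guess_a, guess_b)
print(best_design)
```

The selected design depends on `guess_a` and `guess_b`, which is exactly the sensitivity to the initial estimate that the abstract describes as a weakness of Fisher-information-based criteria.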
Position: Biology is the Challenge Physics-Informed ML Needs to Evolve
Physics-Informed Machine Learning (PIML) has successfully integrated mechanistic understanding into machine learning, particularly in domains governed by well-known physical laws. This success has motivated efforts to apply PIML to biology, a field rich in dynamical systems but shaped by different constraints. Biological modeling, however, presents unique challenges: multi-faceted and uncertain prior knowledge, heterogeneous and noisy data, partial observability, and complex, high-dimensional networks. In this position paper, we argue that these challenges should not be seen as obstacles to PIML, but as catalysts for its evolution. We propose Biology-Informed Machine Learning (BIML): a principled extension of PIML that retains its structural grounding while adapting to the practical realities of biology. Rather than replacing PIML, BIML retools its methods to operate under softer, probabilistic forms of prior knowledge. We outline four foundational pillars as a roadmap for this transition: uncertainty quantification, contextualization, constrained latent structure inference, and scalability. Foundation Models and Large Language Models will be key enablers, bridging human expertise with computational modeling. We conclude with concrete recommendations to build the BIML ecosystem and channel PIML-inspired innovation toward challenges of high scientific and societal relevance.
- North America > Canada > Alberta > Census Division No. 8 > Red Deer County (0.24)
- North America > Canada > Alberta > Census Division No. 7 > Stettler County No. 6 (0.24)
- North America > Canada > Alberta > Census Division No. 5 > Starland County (0.24)
- (4 more...)
- Research Report (0.64)
- Overview (0.46)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Health & Medicine > Therapeutic Area > Oncology (0.93)
- Energy > Power Industry (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Model-Based Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- (2 more...)
Simulation-free Structure Learning for Stochastic Dynamics
Rimawi-Fine, Noah El, Stecklov, Adam, Nelson, Lucas, Blanchette, Mathieu, Tong, Alexander, Zhang, Stephen Y., Atanackovic, Lazar
Modeling dynamical systems and unraveling their underlying causal relationships is central to many domains in the natural sciences. Various physical systems, such as those arising in cell biology, are inherently high-dimensional and stochastic in nature, and admit only partial, noisy state measurements. This poses a significant challenge for addressing the problems of modeling the underlying dynamics and inferring the network structure of these systems. Existing methods are typically tailored either for structure learning or modeling dynamics at the population level, but are limited in their ability to address both problems together. In this work, we address both problems simultaneously: we present StructureFlow, a novel and principled simulation-free approach for jointly learning the structure and stochastic population dynamics of physical systems. We showcase the utility of StructureFlow for the tasks of structure learning from interventions and dynamical (trajectory) inference of conditional population dynamics. We empirically evaluate our approach on high-dimensional synthetic systems, a set of biologically plausible simulated systems, and an experimental single-cell dataset. We show that StructureFlow can learn the structure of underlying systems while simultaneously modeling their conditional population dynamics -- a key step toward the mechanistic understanding of systems behavior.
- North America > Canada > Quebec > Montreal (0.14)
- North America > Canada > Ontario > Toronto (0.14)
- Europe > Hungary > Hajdú-Bihar County > Debrecen (0.04)
STL-based Optimization of Biomolecular Neural Networks for Regression and Control
Palanques-Tost, Eric, Krasowski, Hanna, Arcak, Murat, Weiss, Ron, Belta, Calin
Biomolecular Neural Networks (BNNs), artificial neural networks with biologically synthesizable architectures, achieve universal function approximation capabilities beyond simple biological circuits. However, training BNNs remains challenging due to the lack of target data. To address this, we propose leveraging Signal Temporal Logic (STL) specifications to define training objectives for BNNs. We build on the quantitative semantics of STL, enabling gradient-based optimization of the BNN weights, and introduce a learning algorithm that enables BNNs to perform regression and control tasks in biological systems. Specifically, we investigate two regression problems in which we train BNNs to act as reporters of dysregulated states, and a feedback control problem in which we train the BNN in closed-loop with a chronic disease model, learning to reduce inflammation while avoiding adverse responses to external infections. Our numerical experiments demonstrate that STL-based learning can solve the investigated regression and control tasks efficiently.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- North America > United States > Maryland > Prince George's County > College Park (0.14)
- North America > United States > California > Alameda County > Berkeley (0.14)
- (2 more...)
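The quantitative semantics of STL mentioned in the abstract above assign each specification a real-valued robustness score: positive when the signal satisfies the specification, negative when it violates it. The sketch below shows this for two simple temporal operators on a sampled signal. The smooth minimum is one common way to make such scores differentiable and is an assumption here, not necessarily the authors' formulation; the inflammation trace, the thresholds, and the function names are hypothetical.

```python
import math


def robustness_less_than(signal, threshold):
    """Pointwise robustness of the predicate x < threshold."""
    return [threshold - x for x in signal]


def always(pointwise):
    """Robustness of 'always phi': the worst case over time."""
    return min(pointwise)


def eventually(pointwise):
    """Robustness of 'eventually phi': the best case over time."""
    return max(pointwise)


def soft_min(values, beta=50.0):
    """Smooth lower bound on min(values) via log-sum-exp (assumed surrogate).

    Shifting by the true minimum keeps the exponentials numerically stable.
    """
    m = min(values)
    total = sum(math.exp(-beta * (v - m)) for v in values)
    return m - math.log(total) / beta


# Hypothetical sampled inflammation level over five time steps.
inflammation = [0.9, 0.7, 0.5, 0.4, 0.3]

# "Inflammation always stays below 1.0."
rho_always = always(robustness_less_than(inflammation, 1.0))

# "Inflammation eventually drops below 0.5."
rho_eventually = eventually(robustness_less_than(inflammation, 0.5))

# Differentiable surrogate for the 'always' score.
rho_always_smooth = soft_min(robustness_less_than(inflammation, 1.0))

print(rho_always, rho_eventually, rho_always_smooth)
```

Because the smooth score never exceeds the exact one, a positive `rho_always_smooth` still implies that the exact specification is satisfied.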
Data-driven Discovery of Digital Twins in Biomedical Research
Métayer, Clémence, Ballesta, Annabelle, Martinelli, Julien
Recent technological advances have expanded the availability of high-throughput biological datasets, enabling the reliable design of digital twins of biomedical systems or patients. Such computational tools represent key reaction networks driving perturbation or drug response and can guide drug discovery and personalized therapeutics. Yet, their development still relies on laborious data integration by the human modeler, so that automated approaches are critically needed. The success of data-driven system discovery in Physics, rooted in clean datasets and well-defined governing laws, has fueled interest in applying similar techniques in Biology, which presents unique challenges. Here, we review methodologies for automatically inferring digital twins from biological time series, which mostly involve symbolic or sparse regression. We evaluate algorithms according to eight biological and methodological challenges, associated with noisy/incomplete data, multiple conditions, prior knowledge integration, latent variables, high dimensionality, unobserved variable derivatives, candidate library design, and uncertainty quantification. On these criteria, sparse regression generally outperformed symbolic regression, particularly when using Bayesian frameworks. We further highlight the emerging role of deep learning and large language models, which enable innovative prior knowledge integration, though the reliability and consistency of such approaches must be improved. While no single method addresses all challenges, we argue that progress in learning digital twins will come from hybrid and modular frameworks combining chemical reaction network-based mechanistic grounding, Bayesian uncertainty quantification, and the generative and knowledge integration capacities of deep learning. To support their development, we further propose a benchmarking framework to evaluate methods across all challenges.
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > France (0.04)
- (3 more...)
- Research Report (0.81)
- Workflow (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- (5 more...)
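To make the idea of sparse regression over a candidate library concrete, here is a minimal sketch using sequentially thresholded least squares, a common choice in the equation-discovery literature. The review above does not prescribe this particular algorithm; it is used only as an illustration. The toy system dx/dt = -2x, the library [1, x, x^2], the threshold, and the assumption that derivatives are known exactly (the abstract lists unobserved derivatives as a real-world challenge) are all hypothetical.

```python
import math


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def least_squares(columns, y):
    """Ordinary least squares through the normal equations."""
    gram = [[sum(u * v for u, v in zip(ci, cj)) for cj in columns] for ci in columns]
    rhs = [sum(u * v for u, v in zip(ci, y)) for ci in columns]
    return solve_linear(gram, rhs)


def thresholded_least_squares(columns, y, threshold=0.1):
    """Refit repeatedly, dropping library terms with small coefficients."""
    active = list(range(len(columns)))
    while active:
        solution = least_squares([columns[i] for i in active], y)
        kept = [i for i, v in zip(active, solution) if abs(v) >= threshold]
        if kept == active:
            coefficients = [0.0] * len(columns)
            for i, v in zip(active, solution):
                coefficients[i] = v
            return coefficients
        active = kept
    return [0.0] * len(columns)


# Hypothetical noise-free data from dx/dt = -2 x with x(0) = 1.
times = [0.1 * i for i in range(21)]
x = [math.exp(-2.0 * t) for t in times]
dxdt = [-2.0 * v for v in x]  # derivatives assumed known exactly

# Candidate library: constant, linear, and quadratic terms.
library = [[1.0] * len(x), x, [v * v for v in x]]
coefficients = thresholded_least_squares(library, dxdt)
print(coefficients)
```

With noisy measurements or estimated derivatives, the threshold and the choice of library would need far more care than this sketch suggests.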
Self-Organizing Survival Manifolds: A Theory for Unsupervised Discovery of Prognostic Structures in Biological Systems
Survival is traditionally modeled as a supervised learning task, reliant on curated outcome labels and fixed covariates. This work rejects that premise. It proposes that survival is not an externally annotated target but a geometric consequence: an emergent property of the curvature and flow inherent in biological state space. We develop a theory of Self-Organizing Survival Manifolds (SOSM), in which survival-relevant dynamics arise from low-curvature geodesic flows on latent manifolds shaped by internal biological constraints. A survival energy functional based on geodesic curvature minimization is introduced and shown to induce structures where prognosis aligns with geometric flow stability. We derive discrete and continuous formulations of the objective and prove theoretical results demonstrating the emergence and convergence of survival-aligned trajectories under biologically plausible conditions. The framework draws connections to thermodynamic efficiency, entropy flow, Ricci curvature, and optimal transport, grounding survival modeling in physical law. Health, disease, aging, and death are reframed as geometric phase transitions in the manifold's structure. This theory offers a universal, label-free foundation for modeling survival as a property of form, not annotation, bridging machine learning, biophysics, and the geometry of life itself.
- Europe > Switzerland > Basel-City > Basel (0.04)
- Europe > United Kingdom > North Sea > Southern North Sea (0.04)
- Asia > Japan (0.04)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Health & Medicine > Therapeutic Area > Oncology (0.68)
Geometric Learning Dynamics
We present a unified geometric framework for modeling learning dynamics in physical, biological, and machine learning systems. The theory reveals three fundamental regimes, each emerging from the power-law relationship $g \propto \kappa^\alpha$ between the metric tensor $g$ in the space of trainable variables and the noise covariance matrix $\kappa$. The quantum regime corresponds to $\alpha = 1$ and describes Schrödinger-like dynamics that emerge from a discrete shift symmetry. The efficient learning regime corresponds to $\alpha = \tfrac{1}{2}$ and describes very fast machine learning algorithms. The equilibration regime corresponds to $\alpha = 0$ and describes classical models of biological evolution. We argue that the emergence of the intermediate regime $\alpha = \tfrac{1}{2}$ is a key mechanism underlying the emergence of biological complexity.
- North America > United States > New York (0.04)
- North America > United States > Minnesota > St. Louis County > Duluth (0.04)
- North America > United States > Minnesota > Saint Louis County > Duluth (0.04)
- (2 more...)
Measuring Scientific Capabilities of Language Models with a Systems Biology Dry Lab
Duan, Haonan, Lu, Stephen Zhewen, Harrigan, Caitlin Fiona, Desai, Nishkrit, Lu, Jiarui, Koziarski, Michał, Cotta, Leonardo, Maddison, Chris J.
Designing experiments and interpreting results are core scientific competencies, particularly in biology, where researchers perturb complex systems to uncover the underlying systems. Recent efforts to evaluate the scientific capabilities of large language models (LLMs) fail to test these competencies because wet-lab experimentation is prohibitively expensive in expertise, time, and equipment. We introduce SciGym, a first-in-class benchmark that assesses LLMs' iterative experiment design and analysis abilities in open-ended scientific discovery tasks. SciGym overcomes the challenge of wet-lab costs by running a dry lab of biological systems. These systems, encoded in Systems Biology Markup Language, are efficient for generating simulated data, making them ideal testbeds for experimentation on realistically complex systems. We evaluated six frontier LLMs on 137 small systems, and released a total of 350 systems. Our evaluation shows that while more capable models demonstrated superior performance, all models' performance declined significantly as system complexity increased, suggesting substantial room for improvement in the scientific capabilities of LLM agents.
- North America > Canada > Ontario > Toronto (0.14)
- Asia > Thailand > Bangkok > Bangkok (0.04)
- Asia > Middle East > Saudi Arabia > Asir Province > Abha (0.04)
- Asia > Middle East > Jordan (0.04)
Future of Code with Generative AI: Transparency and Safety in the Era of AI Generated Software
Hanson, David
Abstract: As artificial intelligence (AI) becomes increasingly integrated into software development processes, the prevalence and sophistication of AI-generated code continue to expand rapidly. This study addresses the critical need for transparency and safety in AI-generated code by examining the current landscape, identifying potential risks, and exploring future implications. We analyze market opportunities for detecting AI-generated code, discuss the challenges associated with managing increasing complexity, and propose solutions to enhance transparency and functionality analysis. Furthermore, this study investigates the long-term implications of AI-generated code, including its potential role in the development of artificial general intelligence and its impact on human-AI interaction. In conclusion, we emphasize the importance of proactive measures for ensuring the responsible development and deployment of AI in software engineering.
Introduction: The integration of artificial intelligence (AI) into software development processes marks a pivotal development in the evolution of computer programming. As AI-generated code becomes increasingly prevalent and sophisticated, it introduces both significant opportunities and complex challenges for software engineering.
- North America > United States (0.14)
- Europe (0.14)
- Law (1.00)
- Government (1.00)
- Health & Medicine (0.69)
- Information Technology > Security & Privacy (0.69)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.96)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)